CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models

https://arxiv.org/abs/2010.00133

To measure some forms of social bias in language models against protected demographic groups in the US, we introduce the Crowdsourced Stereotype Pairs benchmark (CrowS-Pairs).

CrowS-Pairs has 1,508 examples that cover stereotypes dealing with nine types of bias, such as race, religion, and age.
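
As a rough illustration of how such a benchmark might be represented in code, the sketch below models one example as a pair of sentences tagged with a bias type, and tallies examples per category. The field names (`sent_more`, `sent_less`, `bias_type`) and the nine category labels are assumptions based on the publicly released CSV, not something stated in the text above, and the sentence strings are placeholders rather than real dataset entries.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed labels for the nine bias categories (taken from memory of the
# released CSV; verify against the actual data before relying on them).
BIAS_TYPES = (
    "race-color",
    "gender",
    "sexual-orientation",
    "religion",
    "age",
    "nationality",
    "disability",
    "physical-appearance",
    "socioeconomic",
)


@dataclass(frozen=True)
class StereotypePair:
    """One benchmark example: two minimally different sentences."""

    sent_more: str  # the more stereotyping sentence (assumed field name)
    sent_less: str  # the less stereotyping sentence (assumed field name)
    bias_type: str  # one of BIAS_TYPES

    def __post_init__(self):
        # Reject labels outside the nine assumed categories.
        if self.bias_type not in BIAS_TYPES:
            raise ValueError(f"unknown bias type: {self.bias_type!r}")


def count_by_bias_type(pairs):
    """Return a dict mapping every bias type to its number of examples."""
    counts = Counter(pair.bias_type for pair in pairs)
    return {bias: counts.get(bias, 0) for bias in BIAS_TYPES}


# Placeholder records for illustration only; not real dataset content.
example_pairs = [
    StereotypePair("<more stereotyping sentence>", "<less stereotyping sentence>", "age"),
    StereotypePair("<more stereotyping sentence>", "<less stereotyping sentence>", "religion"),
    StereotypePair("<more stereotyping sentence>", "<less stereotyping sentence>", "age"),
]
per_type_counts = count_by_bias_type(example_pairs)
```

Keeping the category list as a fixed tuple means a typo in a label fails loudly at construction time instead of silently creating a tenth category in the tally.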